Add logs.sh search and analyse tools by simonsmallchua · Pull Request #373 · Good-Native/hover

simonsmallchua · 2026-05-02T01:17:55Z

Summary

Replaces monitor_logs.sh with a unified scripts/logs.sh dispatcher that has three subcommands — monitor, search, analyse — so agents can ask point-in-time questions without writing one-off bash. Bare logs.sh (no subcommand) runs monitor with all defaults, captures logs every 3s across all five Fly apps, and writes a probe-driven analysis snapshot every 5 minutes plus a final report at run end.

What changed

scripts/logs.sh (new) — dispatcher. Commands: monitor, search, analyse. A leading flag (e.g. logs.sh --interval 5) implicitly routes to monitor.
scripts/search_logs.py (new) — keyword/regex search across one or more runs, supports both live raw/ and post-cleanup raw.zip. Case-insensitive by default; --case-sensitive opts in.
scripts/analyse_logs.py (new) — runs a fixed probe set and writes analysis.{md,json}:
- severity rollup, panics & fatals, HTTP status, latency (p50/p95/p99 + slowest), heartbeat, process health, autoscaler, DB / external errors, Sentry, ad-hoc --keyword/--regex.
- every finding records count, first_seen, last_seen, and peak (timestamp of the highest-count minute).
scripts/filter_since.py (new) — per-app cursor filter piped after flyctl logs so each capture only persists lines newer than the previous iteration. No more 4× capture-overlap inflation.
scripts/monitor_logs.sh — reduced to a 7-line back-compat shim → logs.sh monitor "$@". Existing call sites in opencode.json, .claude/commands/monitor.md, .claude/settings.local.json, and docs/development/DEVELOPMENT.md keep working unchanged.

Defaults

Subcommand	Flag	Default
monitor	`--app`	hover, hover-worker, hover-analysis, hover-autoscaler-worker, hover-autoscaler-analysis
monitor	`--interval`	3s
monitor	`--iterations`	1440 (~72 min at 3s)
monitor	`--analyse-every`	5m (`0` to disable)
monitor	`--run-id`	auto `<adjective>-<colour>` slug
search	case sensitivity	off (insensitive) — `--case-sensitive` opts in
search / analyse	`--run`	`latest`

Output layout

logs/YYYYMMDD/HHMM_<slug>_<settings>/
├── <app>/raw/*.log       # cursor-filtered captures
├── <app>/.cursor         # last-seen ISO timestamp per app
├── snapshots/
│   ├── analysis_<HHMMSS>Z.md
│   └── analysis_<HHMMSS>Z.json
├── analysis.md           # final, written at run end
└── analysis.json

Verified

13-case synthetic smoke test (dispatcher, shim, snapshot cadence, case-sensitive search, regex search, --max cap, ad-hoc keyword probe, app filter, raw.zip analysis, parse_duration rejection, etc).
Real flyctl capture across all 5 hover apps. Surfaced and fixed:
- ANSI-wrapped Fly timestamp prefix not parsed → autoscaler events had empty first/last/peak.
- flyctl logs --no-tail returning the same window every call → added cursor filter.
- Slowest-latency duplicates → fixed via the same dedup.
- Autoscaler probe didn't match real "msg":"reconciling" format → updated patterns.

Test plan

./scripts/logs.sh (no args) starts a default monitor run; verify analysis.md lands at run end and snapshots accumulate every 5 min.
./scripts/logs.sh search --keyword panic --run latest returns matches with per-app counts.
./scripts/logs.sh analyse --keyword pgx adds an "Ad-hoc keywords" section to the report.
./scripts/monitor_logs.sh --help still works (back-compat shim).

Summary by CodeRabbit

New Features
- Unified CLI to monitor, search, and analyze Fly logs with per-run captures, snapshotting, optional cleanup, and final md/json reports; search supports keywords/regex across runs and zipped captures.
- Per-app incremental filtering via persisted cursors to avoid duplicate captures.
- Log analysis reporting severity/status/latency/heartbeat buckets with counts, first/last seen, peaks, and samples; optional stdout rendering.
Bug Fixes
- Improved graceful Ctrl+C handling with consistent exit codes; analysis probe errors reported as non-fatal warnings.
Refactor
- Legacy monitor script now forwards to the unified CLI.
Documentation
- Changelog and docs updated for new tooling, subcommands, defaults, and output layout.

coderabbitai · 2026-05-02T01:18:06Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
✅ Review completed - (🔄 Check again to review again)

📝 Walkthrough

Walkthrough

Adds a canonical Bash dispatcher scripts/logs.sh (monitor/search/analyse), replaces monitor_logs.sh with a shim, and introduces Python tooling: scripts/filter_since.py (cursored incremental filter), scripts/search_logs.py (run discovery + log iterator + search CLI), and scripts/analyse_logs.py (streaming, de-duplicating probe-based analysis producing per-run analysis.json/analysis.md). Entrypoint hardening added to process_logs.py and aggregate_logs.py.

Changes

Logs capture, filtering, search & analysis

Layer / File(s)	Summary
CLI Entrypoint / Dispatcher `scripts/logs.sh`, `scripts/monitor_logs.sh`	Adds `logs.sh` as the canonical dispatcher (default `monitor`) and replaces `monitor_logs.sh` with a shim that `exec`s `logs.sh monitor "$@"`.
Run layout & monitor orchestration `scripts/logs.sh`	Introduces run naming `logs/YYYYMMDD/HHMM_<slug>_<settings>/`, slug generation, `parse_duration()`, TTY-aware ticker/logging, SIGINT handling, optional cleanup (zip/delete), snapshot cadence, and final aggregation/analysis hooks.
Capture loop & per-app wiring `scripts/logs.sh` (`cmd_monitor`, `capture_app`)	Per-iteration capture runs `flyctl logs --no-tail` per app, maintains per-app `.cursor`, optionally pipes through `filter_since.py`, writes per-iteration `raw/*.log`, and invokes `process_logs.py`/`aggregate_logs.py` when Python is available.
Python helper wiring `scripts/logs.sh` (`resolve_python`, `cmd_search`, `cmd_analyse`)	Resolves a usable Python interpreter and `exec`s `search_logs.py` / `analyse_logs.py` with `PYTHONUTF8=1`, forwarding args.
Line filtering / cursor persistence `scripts/filter_since.py`	New stdin utility: strips ANSI, extracts/normalises leading ISO-like timestamps, emits only timestamped headers strictly newer than a persisted per-app cursor (suppressing continuations otherwise), updates the cursor file, and returns defined exit codes.
Run discovery & log iteration API `scripts/search_logs.py` (`resolve_runs`, `iter_lines`)	Resolves `--run` (`latest`/`all`/date/prefix) to run directories and yields `(source_label, line)` from `raw/*.log` and `raw.zip` entries, decoding UTF-8 with replacement.
Search CLI `scripts/search_logs.py` (`search`, `main`)	Search CLI composes escaped keywords/regex into one pattern, scans selected runs/apps, prints matches prefixed with `[run][app][source]`, enforces a global `--max` cap, reports per-app and total counts, and returns the match count as exit code.
Deterministic analyzer & probes `scripts/analyse_logs.py` (`Bucket`, `Probe` subclasses, `_build_probes`, `_analyse_app`, `main`)	Analyzer streams and de-duplicates lines with a bounded 50k sliding window, strips ANSI, best-effort extracts ISO timestamps (JSON-first or leading token), parses embedded JSON for `level`/`msg`, feeds probes (severity, panic, HTTP status, latency, heartbeat, regex buckets), isolates per-probe exceptions to stderr, and records per-label `count`, `first_seen`, `last_seen`, `peak`; latency/heartbeat probes compute percentiles/median and include sample lists.
Output semantics & exit codes `scripts/analyse_logs.py`, `scripts/search_logs.py`, `scripts/filter_since.py`	CLIs validate args and return non-zero on misuse/errors; `analyse_logs.py` returns `1` when no runs match and `2` when `--out` is used with multiple runs. Analysis artifacts are written under each run’s `analysis.md`/`analysis.json` (or to `--out` when a single run).
Minor runtime hardening `scripts/process_logs.py`, `scripts/aggregate_logs.py`	Wraps `__main__` dispatch in `try/KeyboardInterrupt` to exit with code `130` on Ctrl+C.
Docs & config updates `CHANGELOG.md`, `docs/...`, `.claude/commands/monitor.md`, `opencode.json`	Documents `logs.sh` subcommands and new run/output layout, updates examples and monitor template, expands monitored app list, and notes the legacy shim behavior.

Sequence Diagram

sequenceDiagram
    actor User
    participant logs_sh as "logs.sh\n(monitor)"
    participant FlyAPI as "Fly API"
    participant filter as "filter_since.py"
    participant process as "process_logs.py"
    participant analyse as "analyse_logs.py"
    participant Disk as "Disk"

    User->>logs_sh: logs.sh monitor --app myapp
    logs_sh->>Disk: create run directory & metadata

    loop per iteration
        logs_sh->>FlyAPI: fetch logs for myapp
        FlyAPI-->>logs_sh: raw log bytes

        alt Python available
            logs_sh->>filter: filter_since.py (stdin)
            filter-->>logs_sh: filtered new lines
            filter->>Disk: update cursor file
        end

        alt new content
            logs_sh->>process: process_logs.py (iteration)
            process->>Disk: write iteration artifacts
            logs_sh->>logs_sh: aggregate iteration into app run
        end

        alt snapshot interval reached
            logs_sh->>analyse: analyse_logs.py (snapshot)
            analyse->>Disk: write analysis.json / analysis.md
        end
    end

    logs_sh->>analyse: final analysis
    analyse->>Disk: write final analysis.json / analysis.md
    logs_sh-->>User: complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Migrate to slog + Sentry fingerprinting #333 — modifies log-processing scripts and aggregation/csv outputs; overlaps with process/aggregate changes.
Merge main into feature/redis-broker-v2 #335 — touches log-processing tooling and entrypoint/summary behavior similar to changes here.

Poem

🐰 I nibble ANSI, chase timestamps in the night,
I bucket panics, pings, and slow requests by sight,
I dedupe noisy chatter, note the first and last,
I write the snapshots, summaries — then hop home fast,
A rabbit with a logbook, tidy, small, and bright.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 14.08% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Add logs.sh search and analyse tools' directly and specifically describes the main change: introducing new search and analyse subcommands via the logs.sh script.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch work/cranky-bhabha-4ea945

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

supabase · 2026-05-02T01:18:09Z

Updates to Preview Branch (work/cranky-bhabha-4ea945) ↗︎

Deployments	Status	Updated
Database	✅	Sat, 02 May 2026 23:08:28 UTC
Services	✅	Sat, 02 May 2026 23:08:28 UTC
APIs	✅	Sat, 02 May 2026 23:08:28 UTC

Tasks are run on every commit but only new migration files are pushed.
Close and reopen this PR if you want to apply changes from existing seed or migration files.

Tasks	Status	Updated
Configurations	✅	Sat, 02 May 2026 23:08:30 UTC
Migrations	✅	Sat, 02 May 2026 23:08:31 UTC
Seeding	✅	Sat, 02 May 2026 23:08:33 UTC
Edge Functions	✅	Sat, 02 May 2026 23:08:33 UTC

View logs for this Workflow Run ↗︎.
Learn more about Supabase for Git ↗︎.

codecov · 2026-05-02T01:20:11Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

github-actions · 2026-05-02T01:21:51Z

Release Versions

App patch: v0.34.2 → v0.34.3

Changelog

Added

scripts/logs.sh unified Fly log tool with monitor, search, and analyse
subcommands. Bare logs.sh runs monitor with all defaults (3s capture
across all five Fly apps, 5-minute analyse snapshots, final report at run
end). search greps captured raw logs by keyword/regex across one or more
runs (case-insensitive by default, also reads raw.zip after cleanup).
analyse runs a fixed probe set — severity, panics, HTTP status, latency,
heartbeat, process health, autoscaler, database/external errors, Sentry, plus
ad-hoc keywords — and writes analysis.{md,json} with first_seen,
last_seen, and peak timestamps for every finding.
scripts/filter_since.py per-app cursor filter wired into capture_app so
each iteration only persists log lines newer than the previous capture,
eliminating the 4× overlap inflation that came from flyctl logs --no-tail.

Changed

scripts/monitor_logs.sh reduced to a back-compat shim that forwards all
flags to logs.sh monitor. Existing call sites in opencode.json,
.claude/commands/monitor.md, .claude/settings.local.json, and
docs/development/DEVELOPMENT.md continue to work unchanged.
Default monitor app list now includes the two autoscaler sidecars
(hover-autoscaler-worker, hover-autoscaler-analysis).
Default --run-id is auto-generated as a <adjective>-<colour> slug (e.g.
mellow-rose) so concurrent runs are easy to distinguish.

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/analyse_logs.py`:
- Around line 536-565: When multiple runs resolve and args.out is provided the
loop overwrites the same output file for each run; change the logic so we fail
early when len(runs) > 1 and args.out is set to avoid silent overwrites: add a
guard after resolving runs that checks if len(runs) > 1 and args.out is truthy,
print an explanatory error to sys.stderr and return non‑zero (or require passing
--run), or alternatively implement proper combining by collecting all per-run
app_reports into a single aggregated report before writing; update the code
paths that build report, json_path and md_path (the use of base = Path(args.out)
and base.with_suffix(...)) accordingly so a single output is written only once
with combined content or fail fast to force explicit selection.
- Around line 294-324: The report() currently builds its minute window from
self.minute_info.keys(), so minutes with zero info are never included and the
self.minute_info[m] == 0 check is unreachable; update report() to use the
tracked minute list (self.minute_seen) as the source of minutes (e.g., minutes =
sorted(set(self.minute_seen)) or sorted(self.minute_seen) deduped) and build
counts with self.minute_info.get(m, 0) so zero-count minutes are represented;
compute median from those counts and guard first_minute/last_minute access when
minutes is empty to preserve the existing early-return behavior.

In `@scripts/filter_since.py`:
- Around line 42-50: The loop currently writes lines without timestamps
unconditionally, which re-emits continuation lines when their timestamped header
was dropped by the cursor; add a boolean flag (e.g. last_header_emitted = False)
before the loop and update it inside the timestamp handling in the for-loop that
uses ANSI_RE, LEADING_TS_RE and _normalise: set last_header_emitted = True when
you write/emit a timestamped line (out.write(line)) and set last_header_emitted
= False when you skip a timestamped entry due to the cursor (the branch where ts
<= cursor), then change the untimestamped-line branch to only out.write(line) if
last_header_emitted is True so continuation lines are only passed through after
an emitted header.

In `@scripts/logs.sh`:
- Around line 85-96: The defaults are being set unconditionally which overrides
any environment values; change the hard-coded assignments (e.g., APP, INTERVAL,
SAMPLES, ITERATIONS, RUN_ID, OUTPUT_ROOT, CLEANUP_OLD, CLEANUP_DAYS,
CLEANUP_MODE, ANALYSE_EVERY, PYTHON_CMD, PYTHON_ARGS) to use parameter-expansion
defaults so env vars take precedence (use patterns like VAR="${VAR:-default}" or
the equivalent for arrays) and apply the same fix to the other block mentioned
(lines 134-139) so exported environment overrides actually take effect.
- Around line 141-159: The option parsing loop in scripts/logs.sh uses
constructs like --app, --interval, --samples, --iterations, --run-id,
--analyse-every, --cleanup-days and --cleanup-mode that unconditionally read
"$2" (and shift 2), which fails under set -u when the argument is missing;
update the case arm for each argument-taking option inside the while/case block
to first verify an argument exists (e.g. test $# -ge 2 or that "${2-}" is
non-empty) and call monitor_usage with an error message and exit when missing,
leaving the existing --no-cleanup and -h|--help behavior unchanged so flags
still shift only once or not at all.

In `@scripts/search_logs.py`:
- Around line 137-150: The CLI currently accepts negative --max values which
causes surprising behavior; after parsing args (variable args from the
ArgumentParser instance p) add a validation check that args.max is >= 0 and if
it's negative print a clear error to stderr (e.g., "Invalid --max: must be
non-negative") and return non-zero (2) so the program fails fast; update the
block after args = p.parse_args() and before using args.max in the search loop
to perform this check.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 40d23595-88c6-489d-b3aa-42ba0af7097c

📥 Commits

Reviewing files that changed from the base of the PR and between c6288ce and 016a247.

📒 Files selected for processing (5)

scripts/analyse_logs.py
scripts/filter_since.py
scripts/logs.sh
scripts/monitor_logs.sh
scripts/search_logs.py

github-actions · 2026-05-02T01:25:34Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/analyse_logs.py`:
- Around line 405-416: The unbounded `seen` set can grow without bound; replace
it with a bounded recent-window dedupe (e.g., implement a fixed-size LRU set) so
only the last N unique lines are tracked and memory can't explode. Concretely,
introduce a constant like RECENT_DEDUPE_WINDOW and swap `seen: set[str] = set()`
for a small LRU structure (deque+set or OrderedDict/collections.deque or a
third-party LRU like cachetools) keyed by the line hash (use
`_strip_ansi(raw_line).rstrip()` or its hash) and on insert evict the oldest
entry when size > RECENT_DEDUPE_WINDOW; keep the rest of the loop logic
(`iter_lines`, `_strip_ansi`, `unique_lines` increment) unchanged.

In `@scripts/filter_since.py`:
- Around line 9-10: Update the module docstring to reflect the actual behavior:
untimestamped continuation lines are not always passed through but are only
emitted when header_emitted is true (i.e., after a qualifying header/timestamp
has been emitted), so revise the docstring wording that currently claims "Lines
without a parseable leading timestamp pass through untouched" to state the
conditional behavior tied to the header_emitted flag and the emission logic in
the main filter loop that checks header_emitted before printing continuation
lines.

In `@scripts/logs.sh`:
- Around line 315-325: The current branches that write flyctl output directly to
"$raw_file" (the two places invoking flyctl logs | tail -n "$SAMPLES" and the
one piping to filter_since.py) must be changed to write to a temporary file, run
a redaction step, and only persist the sanitized output to "$raw_file"; update
both code paths that call flyctl logs and the invocation that currently pipes to
filter_since.py (referencing "$raw_file", "filter_since.py", "cursor_file",
"SAMPLES", "PYTHON_CMD", "PYTHON_ARGS") so that: 1) capture goes to a temp path,
2) run a redactor (either an existing redact script/function or a new
redact_logs step) that strips secrets/JWTs/PII, 3) write the redacted content to
"$raw_file" and log that raw output was sanitized, and 4) avoid storing
unredacted logs unless an explicit safe-by-default override is set (e.g.,
SAFE_PERSIST_RAW=true) and gated by a strict flag check. Ensure both code
branches use the same temp→redact→persist flow.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4ee33696-a858-4b14-b541-f5d8e7c12136

📥 Commits

Reviewing files that changed from the base of the PR and between fb9176a and 00366d0.

📒 Files selected for processing (4)

scripts/analyse_logs.py
scripts/filter_since.py
scripts/logs.sh
scripts/search_logs.py

✅ Files skipped from review due to trivial changes (1)

scripts/search_logs.py

github-actions · 2026-05-02T01:34:11Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

🧹 Nitpick comments (2)

scripts/analyse_logs.py (2)
145-145: 💤 Low value

Consider renaming loop variable l to lvl or level.

The static analyzer flags l as ambiguous (could be confused with 1). This appears in several list comprehensions (lines 145, 166, 192, 226). While not a bug, more descriptive names improve readability.
♻️ Example fix for line 145
-        findings = [self.buckets[l].to_dict(l) for l in self.LEVELS if self.buckets[l].count]
+        findings = [self.buckets[lvl].to_dict(lvl) for lvl in self.LEVELS if self.buckets[lvl].count]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` at line 145, Rename the ambiguous loop variable `l`
used in list comprehensions to a clearer name such as `level` (or `lvl`)
wherever it appears (e.g., in the comprehension that builds `findings =
[self.buckets[l].to_dict(l) for l in self.LEVELS if self.buckets[l].count]`),
update the variable inside the expression and its use when calling
`self.buckets[...].to_dict(...)`, and apply the same rename to the other similar
comprehensions that iterate over `self.LEVELS` (the occurrences that build other
lists and call `to_dict` on `self.buckets`) so the code uses `level`
consistently for readability.
319-319: Consider using itertools.pairwise() for successive pairs if Python 3.10+ is the minimum version.

itertools.pairwise() is more idiomatic than zip(minutes, minutes[1:]) for this pattern, but it requires Python 3.10+. If the minimum supported version is 3.10+, consider the refactoring below. Otherwise, the current zip() approach works correctly.
Suggested refactoring (Python 3.10+ only)
+from itertools import pairwise
 ...
-        for prev, curr in zip(minutes, minutes[1:]):
+        for prev, curr in pairwise(minutes):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` at line 319, Replace the manual sliding-window zip
pattern with itertools.pairwise when the project targets Python 3.10+: change
the loop that currently reads "for prev, curr in zip(minutes, minutes[1:]):" to
use itertools.pairwise(minutes) and add "from itertools import pairwise" at the
top of the module; keep the loop body unchanged so it still processes successive
(prev, curr) pairs from the minutes sequence.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@scripts/analyse_logs.py`:
- Line 145: Rename the ambiguous loop variable `l` used in list comprehensions
to a clearer name such as `level` (or `lvl`) wherever it appears (e.g., in the
comprehension that builds `findings = [self.buckets[l].to_dict(l) for l in
self.LEVELS if self.buckets[l].count]`), update the variable inside the
expression and its use when calling `self.buckets[...].to_dict(...)`, and apply
the same rename to the other similar comprehensions that iterate over
`self.LEVELS` (the occurrences that build other lists and call `to_dict` on
`self.buckets`) so the code uses `level` consistently for readability.
- Line 319: Replace the manual sliding-window zip pattern with
itertools.pairwise when the project targets Python 3.10+: change the loop that
currently reads "for prev, curr in zip(minutes, minutes[1:]):" to use
itertools.pairwise(minutes) and add "from itertools import pairwise" at the top
of the module; keep the loop body unchanged so it still processes successive
(prev, curr) pairs from the minutes sequence.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2b7f9fef-1f19-4edb-9796-cf4ebbf31117

📥 Commits

Reviewing files that changed from the base of the PR and between 00366d0 and 072e00b.

📒 Files selected for processing (2)

scripts/analyse_logs.py
scripts/filter_since.py

🚧 Files skipped from review as they are similar to previous changes (1)

scripts/filter_since.py

github-actions · 2026-05-02T01:49:30Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

github-actions · 2026-05-02T02:00:52Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

🧹 Nitpick comments (2)

scripts/analyse_logs.py (2)

272-274: 💤 Low value

Remove unnecessary int() cast.

round() already returns an integer in Python 3 when called with a single argument.

♻️ Suggested fix

         def pct(p: float) -> float:
-            i = max(0, min(n - 1, int(round(p / 100 * (n - 1)))))
+            i = max(0, min(n - 1, round(p / 100 * (n - 1))))
             return vs[i]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` around lines 272 - 274, The pct function
unnecessarily casts round() to int; update pct(p: float) -> float to remove the
redundant int() call by using i = max(0, min(n - 1, round(p / 100 * (n - 1))))
and keep returning vs[i]; ensure you still clamp i between 0 and n-1 and that
variable names (pct, p, n, vs) remain unchanged.

323-323: 💤 Low value

Consider using itertools.pairwise() for successive pairs.

This is a cleaner pattern for iterating over consecutive elements. The strict= parameter isn't needed here since the slice guarantees correct pairing.

♻️ Suggested fix

Add import at the top:

from itertools import pairwise

Then update the loop:

-        for prev, curr in zip(minutes, minutes[1:]):
+        for prev, curr in pairwise(minutes):

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` at line 323, Replace the manual sliding-window zip
loop for consecutive entries with itertools.pairwise: add "from itertools import
pairwise" at the top, then replace the "for prev, curr in zip(minutes,
minutes[1:]):" loop with "for prev, curr in pairwise(minutes):" to iterate
successive pairs of the minutes sequence (keep existing loop body unchanged).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@scripts/analyse_logs.py`:
- Around line 272-274: The pct function unnecessarily casts round() to int;
update pct(p: float) -> float to remove the redundant int() call by using i =
max(0, min(n - 1, round(p / 100 * (n - 1)))) and keep returning vs[i]; ensure
you still clamp i between 0 and n-1 and that variable names (pct, p, n, vs)
remain unchanged.
- Line 323: Replace the manual sliding-window zip loop for consecutive entries
with itertools.pairwise: add "from itertools import pairwise" at the top, then
replace the "for prev, curr in zip(minutes, minutes[1:]):" loop with "for prev,
curr in pairwise(minutes):" to iterate successive pairs of the minutes sequence
(keep existing loop body unchanged).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4c9acc76-7581-4c02-8dbb-9f4ba5e60124

📥 Commits

Reviewing files that changed from the base of the PR and between 072e00b and c2ddc63.

📒 Files selected for processing (1)

scripts/analyse_logs.py

github-actions · 2026-05-02T02:09:03Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

scripts/analyse_logs.py (1)

315-336: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Heartbeat still misses minutes that had traffic but zero info logs.

This loop now detects only missing wall-clock minutes between observed timestamps. A minute with warn/error traffic and zero info lines is still omitted, so the probe doesn't actually cover the “info-level traffic dropped to zero mid-run” case from its own contract.

Suggested fix

-        minutes = sorted(set(self.minute_seen))
-        non_zero = [self.minute_info[m] for m in minutes if self.minute_info[m] > 0]
+        minutes = sorted(set(self.minute_seen))
+        non_zero = [self.minute_info.get(m, 0) for m in minutes if self.minute_info.get(m, 0) > 0]
         median = statistics.median(non_zero) if non_zero else 0
         gaps: list[dict] = []
-        # Walk consecutive observed minutes and flag any wall-clock minute that
-        # fell entirely between them with no log lines. Without this we'd never
-        # report a gap, since `minute_seen` only contains minutes that *had*
-        # traffic.
         if median >= 1 and len(minutes) >= 2:
+            for m in minutes[1:-1]:
+                if self.minute_info.get(m, 0) == 0 and len(gaps) < 200:
+                    gaps.append({"minute": m, "expected_min": int(median)})
             for prev, curr in pairwise(minutes):
                 try:
                     p = datetime.strptime(prev, "%Y-%m-%dT%H:%M")
                     c = datetime.strptime(curr, "%Y-%m-%dT%H:%M")
                 except ValueError:
                     continue
                 step = p + timedelta(minutes=1)
                 while step < c and len(gaps) < 200:
-                    gaps.append({
-                        "minute": step.strftime("%Y-%m-%dT%H:%M"),
-                        "expected_min": int(median),
-                    })
+                    minute = step.strftime("%Y-%m-%dT%H:%M")
+                    if all(g["minute"] != minute for g in gaps):
+                        gaps.append({"minute": minute, "expected_min": int(median)})
                     step += timedelta(minutes=1)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` around lines 315 - 336, The code only flags
wall-clock minutes missing entirely between observed minutes but ignores minutes
that were observed in minute_seen yet have zero info-level lines (minute_info[m]
== 0); update the logic around minute_seen/minute_info to also treat those
zero-info observed minutes as gaps: after computing minutes and median, collect
zero_info = [m for m in minutes if self.minute_info.get(m, 0) == 0] and append
each as a gap entry (with "minute" and "expected_min": int(median)) while
keeping the existing between-minute walk (pairwise(minutes)) and the global cap
(len(gaps) < 200); ensure you deduplicate entries (e.g., via a set) so minutes
found by both methods aren’t added twice.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/analyse_logs.py`:
- Around line 563-572: The code unconditionally rejects multi-run selections by
checking only len(runs) > 1; change the guard to only error when multiple runs
would share a single output base (i.e., when an output path was provided).
Replace the condition with something like: if len(runs) > 1 and out is not None
(or the actual variable holding the --out value), update the error message to
mention that --out causes shared output and keep the existing behavior that
allows multi-run processing when --out is unset so the loop below can emit one
report per run.

---

Duplicate comments:
In `@scripts/analyse_logs.py`:
- Around line 315-336: The code only flags wall-clock minutes missing entirely
between observed minutes but ignores minutes that were observed in minute_seen
yet have zero info-level lines (minute_info[m] == 0); update the logic around
minute_seen/minute_info to also treat those zero-info observed minutes as gaps:
after computing minutes and median, collect zero_info = [m for m in minutes if
self.minute_info.get(m, 0) == 0] and append each as a gap entry (with "minute"
and "expected_min": int(median)) while keeping the existing between-minute walk
(pairwise(minutes)) and the global cap (len(gaps) < 200); ensure you deduplicate
entries (e.g., via a set) so minutes found by both methods aren’t added twice.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b3012282-ced8-420e-9401-ceda40c96711

📥 Commits

Reviewing files that changed from the base of the PR and between 072e00b and d6667ad.

📒 Files selected for processing (1)

scripts/analyse_logs.py

github-actions · 2026-05-02T04:32:32Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

github-actions · 2026-05-02T04:49:02Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

🧹 Nitpick comments (1)

.claude/commands/monitor.md (1)
43-53: ⚡ Quick win

Add a language tag to the fenced output-structure block.

Line 43 triggers MD040; adding a fence language keeps docs lint-clean.
💡 Suggested fix
-```
+```text
 logs/YYYYMMDD/HHMM_<slug>_<settings>/
 ├── <app>/raw/*.log          # cursor-filtered captures (one per iteration)
 ├── <app>/.cursor            # last-seen ISO timestamp per app
 ├── snapshots/
 │   ├── analysis_<HHMMSS>Z.md
 │   └── analysis_<HHMMSS>Z.json
 ├── analysis.md              # final probe report
 ├── analysis.json
 └── monitor.log              # verbose run history
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @.claude/commands/monitor.md around lines 43 - 53, The fenced code block in
.claude/commands/monitor.md (the output-structure block starting with
"logs/YYYYMMDD/HHMM__/") is missing a language tag which
triggers MD040; update the opening fence from totext so the block is
explicitly marked as plain text (no other edits to the block content needed).
</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.claude/commands/monitor.md:

Around line 43-53: The fenced code block in .claude/commands/monitor.md (the
output-structure block starting with "logs/YYYYMMDD/HHMM__/") is
missing a language tag which triggers MD040; update the opening fence from totext so the block is explicitly marked as plain text (no other edits to
the block content needed).
</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Path: .coderabbit.yaml

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `cb9e6ef6-8538-46ec-94dd-30215d66205f`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 4d86dcc621b48417cfd10ed8b5f336febefb9ed6 and c0d25a90172e2504a7b2dfe92f603d2d48b5d0b3.

</details>

<details>
<summary>📒 Files selected for processing (4)</summary>

* `.claude/commands/monitor.md`
* `docs/development/DEVELOPMENT.md`
* `opencode.json`
* `scripts/logs.sh`

</details>

<details>
<summary>✅ Files skipped from review due to trivial changes (1)</summary>

* docs/development/DEVELOPMENT.md

</details>

</details>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/monitor.md:
- Around line 62-63: Update the back-compat note so the legacy script invocation
is unambiguous: change the reference from "logs.sh monitor" to
"./scripts/logs.sh monitor" in the sentence that mentions
`scripts/monitor_logs.sh` so it matches the rest of the doc and avoids
copy-paste failures.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 419dc357-ab9c-4d54-8d71-bed47d079a0f

📥 Commits

Reviewing files that changed from the base of the PR and between c0d25a9 and 9ec9129.

📒 Files selected for processing (1)

.claude/commands/monitor.md

github-actions · 2026-05-02T07:28:38Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

github-actions · 2026-05-02T07:40:23Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/logs.sh`:
- Around line 187-194: The script currently only validates APPS[0], allowing
empty app names (e.g., "hover,hover-worker,") to pass; update the
parsing/validation so every parsed app name in the APPS array is trimmed and
checked for emptiness: after IFS=',' read -r -a APPS <<< "$APP" and the trimming
loop, either filter out empty elements (e.g., rebuild APPS with only non-empty
strings) or add a validation loop that exits with an error if any APPS[i] is
empty; ensure subsequent usage (RUN_DIR/... and the loop invoking flyctl logs
--app "$app") skips or errors on empty names; apply the same change to the
duplicate logic around the other occurrence referenced (lines ~244-246) so both
places enforce non-empty app names.
- Around line 338-342: The current cleanup uses iter_files=$(find "$run_dir" ...
) and then printf ... | xargs rm -f which breaks on filenames with spaces;
change to a null-delimited approach: use find "$run_dir" -type f -name
'*_iter*.json' -print0 and feed that directly to xargs -0 rm -f (or test for
files with find ... -print0 | grep -z . >/dev/null) so deletion is safe for
paths with spaces; update the block referencing iter_files, the find invocation,
and the log_to_file call that reports "Removing iteration JSONs:
$date_dir/$run_name" to use this null-delimited pipeline and preserve the
existing failure logging (i.e., pipe to xargs -0 rm -f || log_to_file ...).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 75fe8560-39f1-48a7-b6ef-f5adc90108b7

📥 Commits

Reviewing files that changed from the base of the PR and between ebbcec4 and 3b14997.

📒 Files selected for processing (1)

scripts/logs.sh

github-actions · 2026-05-02T10:00:58Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

scripts/logs.sh (2)

338-342: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use null-delimited cleanup for iteration JSONs.

This find output is marshalled through newline-delimited xargs rm -f, so spaces in OUTPUT_ROOT or an existing run path will split the filename and can skip or mis-target deletions.

Suggested fix

-                    iter_files=$(find "$run_dir" -type f -name '*_iter*.json' 2>/dev/null)
-                    if [[ -n "$iter_files" ]]; then
+                    if find "$run_dir" -type f -name '*_iter*.json' -print -quit 2>/dev/null | grep -q .; then
                         log_to_file "  Removing iteration JSONs: $date_dir/$run_name"
-                        printf '%s\n' "$iter_files" | xargs rm -f || \
+                        find "$run_dir" -type f -name '*_iter*.json' -print0 2>/dev/null | xargs -0 rm -f || \
                             log_to_file "  Failed to remove iteration JSONs in $run_dir"
                     fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/logs.sh` around lines 338 - 342, The current removal of iteration
JSONs collects filenames into iter_files and pipes them to xargs which breaks on
spaces; update the cleanup to use a null-delimited approach: change the find
invocation that sets iter_files (the code manipulating iter_files, run_dir and
the find command) to produce NUL-separated results (e.g., find ... -print0) and
invoke removal with a NUL-aware consumer (e.g., xargs -0 rm -f or find ... -exec
rm -f {} +), and keep the existing log_to_file calls (the messages that
reference date_dir/$run_name and the failure message) intact so failures still
log correctly.

187-194: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject empty app entries after splitting --app.

Line 191 only proves the first element is non-empty. Inputs like --app "hover,hover-worker," still create "$RUN_DIR//raw" and call flyctl logs --app "" on every iteration.

Suggested fix

     IFS=',' read -r -a APPS <<< "$APP"
     for i in "${!APPS[@]}"; do
         APPS[i]="${APPS[i]// /}"
     done
-    if [[ ${`#APPS`[@]} -eq 0 || -z "${APPS[0]}" ]]; then
+    if [[ ${`#APPS`[@]} -eq 0 ]]; then
         echo "at least one app name is required" >&2
         exit 1
     fi
+    for app in "${APPS[@]}"; do
+        if [[ -z "$app" ]]; then
+            echo "app list contains an empty value; check comma placement" >&2
+            exit 1
+        fi
+    done

Also applies to: 244-246

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/logs.sh` around lines 187 - 194, After splitting into APPS with
IFS=',' and trimming spaces in the existing for i in "${!APPS[@]}" loop, filter
out empty entries so blank elements (e.g., trailing commas) are removed before
further use; update the loop that currently does APPS[i]="${APPS[i]// /}" to
instead trim whitespace and discard any APPS[i] that becomes empty (rebuild a
new array of non-empty app names or compact APPS in-place), then update the
subsequent validation (the if [[ ${`#APPS`[@]} -eq 0 || -z "${APPS[0]}" ]]) to
check the filtered array; apply the same filtering fix to the identical
splitting/validation block around the later occurrence (the second APPS split at
lines ~244-246).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/logs.sh`:
- Around line 230-240: Sanitise RUN_ID before using it to build RUN_DIR: ensure
any value coming from generate_run_slug or the --run-id input is stripped of
path-separators and path-traversal tokens (e.g., remove or replace '/' and '\'
and disallow '..'), normalise to a safe character set (e.g., alphanumerics,
hyphen, underscore) and fall back to a safe slug if the result is empty; then
use that sanitized variable when composing DATE_DIR, TIME_PREFIX and RUN_DIR
(references: RUN_ID, generate_run_slug, SETTINGS_SUFFIX, RUN_DIR, DATE_DIR,
OUTPUT_ROOT). Ensure the sanitisation happens immediately after RUN_ID
assignment and before any use in path construction so later cleanup logic can
safely assume runs live exactly two levels below OUTPUT_ROOT.
- Around line 30-43: The resolve_python function currently only tests that
candidates can import sys (which allows Python 2); update it to verify Python 3
by running a single probe that checks sys.version_info[0] >= 3 and reuse that
probe for all candidates to avoid drift; specifically, replace the three
separate import checks with a helper probe (e.g., check_py) that executes a
small Python snippet which exits 0 only if sys.version_info[0] >= 3, and use it
to set PYTHON_CMD and PYTHON_ARGS for "python3", "python", and "py" (using "py
-3" for the latter) while preserving the same variable names (PYTHON_CMD,
PYTHON_ARGS) and error behavior.

---

Duplicate comments:
In `@scripts/logs.sh`:
- Around line 338-342: The current removal of iteration JSONs collects filenames
into iter_files and pipes them to xargs which breaks on spaces; update the
cleanup to use a null-delimited approach: change the find invocation that sets
iter_files (the code manipulating iter_files, run_dir and the find command) to
produce NUL-separated results (e.g., find ... -print0) and invoke removal with a
NUL-aware consumer (e.g., xargs -0 rm -f or find ... -exec rm -f {} +), and keep
the existing log_to_file calls (the messages that reference date_dir/$run_name
and the failure message) intact so failures still log correctly.
- Around line 187-194: After splitting into APPS with IFS=',' and trimming
spaces in the existing for i in "${!APPS[@]}" loop, filter out empty entries so
blank elements (e.g., trailing commas) are removed before further use; update
the loop that currently does APPS[i]="${APPS[i]// /}" to instead trim whitespace
and discard any APPS[i] that becomes empty (rebuild a new array of non-empty app
names or compact APPS in-place), then update the subsequent validation (the if
[[ ${`#APPS`[@]} -eq 0 || -z "${APPS[0]}" ]]) to check the filtered array; apply
the same filtering fix to the identical splitting/validation block around the
later occurrence (the second APPS split at lines ~244-246).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d8edd0eb-2151-48fe-afda-c917ba504f95

📥 Commits

Reviewing files that changed from the base of the PR and between ebbcec4 and 3b14997.

📒 Files selected for processing (1)

scripts/logs.sh

coderabbitai

♻️ Duplicate comments (4)

scripts/logs.sh (4)

187-194: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate every parsed app name, not just APPS[0].

Inputs like --app "hover,hover-worker," still pass here, then create "$RUN_DIR//raw" and later call flyctl logs --app "". Reject empty entries after trimming, or rebuild APPS without them.

Suggested fix

     IFS=',' read -r -a APPS <<< "$APP"
     for i in "${!APPS[@]}"; do
         APPS[i]="${APPS[i]// /}"
     done
-    if [[ ${`#APPS`[@]} -eq 0 || -z "${APPS[0]}" ]]; then
+    if [[ ${`#APPS`[@]} -eq 0 ]]; then
         echo "at least one app name is required" >&2
         exit 1
     fi
+    for app_name in "${APPS[@]}"; do
+        if [[ -z "$app_name" ]]; then
+            echo "app list contains an empty value; check comma placement" >&2
+            exit 1
+        fi
+    done

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/logs.sh` around lines 187 - 194, The current validation only checks
APPS[0], allowing empty entries like trailing commas; after splitting into APPS
(IFS=',' read -r -a APPS) and trimming (the loop that does APPS[i]="${APPS[i]//
/}"), filter out any empty strings by rebuilding APPS to include only non-empty
elements (e.g., iterate over APPS and append only if not empty) and then perform
the required emptiness check on the rebuilt APPS array so inputs like
"hover,hover-worker," are rejected and no empty app names are used later.

338-342: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use a null-delimited cleanup pipeline here.

This find → plain xargs rm -f path splits on whitespace, so runs with spaces in their path can leave iteration JSON behind or hand broken tokens to rm.

Suggested fix

-                    iter_files=$(find "$run_dir" -type f -name '*_iter*.json' 2>/dev/null)
-                    if [[ -n "$iter_files" ]]; then
+                    if find "$run_dir" -type f -name '*_iter*.json' -print -quit 2>/dev/null | grep -q .; then
                         log_to_file "  Removing iteration JSONs: $date_dir/$run_name"
-                        printf '%s\n' "$iter_files" | xargs rm -f || \
+                        find "$run_dir" -type f -name '*_iter*.json' -print0 2>/dev/null | xargs -0 rm -f || \
                             log_to_file "  Failed to remove iteration JSONs in $run_dir"
                     fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/logs.sh` around lines 338 - 342, The cleanup pipeline fails on paths
with whitespace because iter_files is fed to plain xargs; change the logic
around the iter_files check to use a null-delimited find output (e.g., replace
collecting into iter_files and printf ... | xargs rm -f with a null-safe
pipeline using find ... -print0 piped to xargs -0 rm -f, or iterate with while
IFS= read -r -d '' file; do rm -f "$file"; done < <(find "$run_dir" -type f
-name '*_iter*.json' -print0)); keep the existing log_to_file invocations (e.g.,
the "Removing iteration JSONs: $date_dir/$run_name" and failure log) and ensure
you still only run the removal when matching files exist.

230-240: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitise --run-id before composing RUN_DIR.

RUN_ID is interpolated straight into a filesystem path. Values containing /, \, or .. can escape the advertised logs/YYYYMMDD/HHMM_<slug>_<settings>/ layout and confuse the later cleanup logic that assumes runs sit exactly two levels below "$OUTPUT_ROOT".

Suggested fix

+    if [[ -n "$RUN_ID" && ! "$RUN_ID" =~ ^[A-Za-z0-9._-]+$ ]]; then
+        echo "run-id may only contain letters, numbers, dot, underscore, and hyphen" >&2
+        exit 1
+    fi
+
     if [[ -z "$RUN_ID" ]]; then
         RUN_ID="$(generate_run_slug)_${SETTINGS_SUFFIX}"
     else
         RUN_ID="${RUN_ID}_${SETTINGS_SUFFIX}"
     fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/logs.sh` around lines 230 - 240, Sanitise RUN_ID before composing
RUN_DIR: ensure the value produced by generate_run_slug or supplied via --run-id
is stripped of path traversal characters and separators (like '/' and '\' and
any '..' segments) and normalized to a safe slug (e.g., remove/replace those
characters and collapse repeated dots) so it cannot escape OUTPUT_ROOT; then use
that sanitized variable when building RUN_DIR (referencing RUN_ID and RUN_DIR in
the snippet) and ensure it never becomes empty (fallback to a safe default if
needed).

30-43: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require a real Python 3 probe in both interpreter selectors.

python -c "import sys" also succeeds on Python 2, so search, analyse, and the monitor-side helpers can still bind to an unsupported interpreter and fail later. Please gate both blocks on sys.version_info[0] >= 3, ideally via one shared probe so they cannot drift again.

Suggested fix

+check_python3() {
+    "$@" - <<'PY' >/dev/null 2>&1
+import sys
+raise SystemExit(0 if sys.version_info[0] >= 3 else 1)
+PY
+}
+
 resolve_python() {
-    if command -v python3 >/dev/null 2>&1 && python3 -c "import sys" >/dev/null 2>&1; then
+    if command -v python3 >/dev/null 2>&1 && check_python3 python3; then
         PYTHON_CMD="python3"
         PYTHON_ARGS=()
-    elif command -v python >/dev/null 2>&1 && python -c "import sys" >/dev/null 2>&1; then
+    elif command -v python >/dev/null 2>&1 && check_python3 python; then
         PYTHON_CMD="python"
         PYTHON_ARGS=()
-    elif command -v py >/dev/null 2>&1 && py -3 -c "import sys" >/dev/null 2>&1; then
+    elif command -v py >/dev/null 2>&1 && check_python3 py -3; then
         PYTHON_CMD="py"
         PYTHON_ARGS=(-3)
     else
         echo "Python 3 is required for this command but was not found." >&2
         exit 1
@@
-    if command -v python3 >/dev/null 2>&1 && python3 -c "import sys" >/dev/null 2>&1; then
+    if command -v python3 >/dev/null 2>&1 && check_python3 python3; then
         PYTHON_CMD="python3"
-    elif command -v python >/dev/null 2>&1 && python -c "import sys" >/dev/null 2>&1; then
+    elif command -v python >/dev/null 2>&1 && check_python3 python; then
         PYTHON_CMD="python"
-    elif command -v py >/dev/null 2>&1 && py -3 -c "import sys" >/dev/null 2>&1; then
+    elif command -v py >/dev/null 2>&1 && check_python3 py -3; then
         PYTHON_CMD="py"
         PYTHON_ARGS=(-3)
     fi

Also applies to: 196-203

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/logs.sh` around lines 30 - 43, In resolve_python(), the current
probes (python3 -c "import sys" and python -c "import sys") can succeed on
Python 2; change both checks to validate major version >=3 (e.g. replace the
probe commands with python3 -c 'import sys; sys.exit(0 if sys.version_info[0] >=
3 else 1)' and python -c 'import sys; sys.exit(0 if sys.version_info[0] >= 3
else 1)') so PYTHON_CMD/PYTHON_ARGS only bind to a true Python 3 interpreter;
alternatively factor this into a single shared probe helper used by the python3
and python branches (referencing resolve_python, PYTHON_CMD, PYTHON_ARGS) and
apply the same pattern for the py branch as needed.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@scripts/logs.sh`:
- Around line 187-194: The current validation only checks APPS[0], allowing
empty entries like trailing commas; after splitting into APPS (IFS=',' read -r
-a APPS) and trimming (the loop that does APPS[i]="${APPS[i]// /}"), filter out
any empty strings by rebuilding APPS to include only non-empty elements (e.g.,
iterate over APPS and append only if not empty) and then perform the required
emptiness check on the rebuilt APPS array so inputs like "hover,hover-worker,"
are rejected and no empty app names are used later.
- Around line 338-342: The cleanup pipeline fails on paths with whitespace
because iter_files is fed to plain xargs; change the logic around the iter_files
check to use a null-delimited find output (e.g., replace collecting into
iter_files and printf ... | xargs rm -f with a null-safe pipeline using find ...
-print0 piped to xargs -0 rm -f, or iterate with while IFS= read -r -d '' file;
do rm -f "$file"; done < <(find "$run_dir" -type f -name '*_iter*.json'
-print0)); keep the existing log_to_file invocations (e.g., the "Removing
iteration JSONs: $date_dir/$run_name" and failure log) and ensure you still only
run the removal when matching files exist.
- Around line 230-240: Sanitise RUN_ID before composing RUN_DIR: ensure the
value produced by generate_run_slug or supplied via --run-id is stripped of path
traversal characters and separators (like '/' and '\' and any '..' segments) and
normalized to a safe slug (e.g., remove/replace those characters and collapse
repeated dots) so it cannot escape OUTPUT_ROOT; then use that sanitized variable
when building RUN_DIR (referencing RUN_ID and RUN_DIR in the snippet) and ensure
it never becomes empty (fallback to a safe default if needed).
- Around line 30-43: In resolve_python(), the current probes (python3 -c "import
sys" and python -c "import sys") can succeed on Python 2; change both checks to
validate major version >=3 (e.g. replace the probe commands with python3 -c
'import sys; sys.exit(0 if sys.version_info[0] >= 3 else 1)' and python -c
'import sys; sys.exit(0 if sys.version_info[0] >= 3 else 1)') so
PYTHON_CMD/PYTHON_ARGS only bind to a true Python 3 interpreter; alternatively
factor this into a single shared probe helper used by the python3 and python
branches (referencing resolve_python, PYTHON_CMD, PYTHON_ARGS) and apply the
same pattern for the py branch as needed.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9522d76b-59e5-4f58-9a91-5edd11b4ea3c

📥 Commits

Reviewing files that changed from the base of the PR and between ebbcec4 and 3b14997.

📒 Files selected for processing (1)

scripts/logs.sh

github-actions · 2026-05-02T12:05:43Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

🧹 Nitpick comments (1)

scripts/analyse_logs.py (1)
20-20: 💤 Low value

Unused import: zipfile.

The zipfile module is imported but never used in this file — iter_lines() from search_logs.py already handles .zip extraction internally.
♻️ Suggested fix
-import zipfile
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` at line 20, Remove the unused import of the zipfile
module from scripts/analyse_logs.py; since zip extraction is already handled by
iter_lines() in search_logs.py, delete the line importing zipfile to avoid an
unused import warning and keep imports minimal.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@scripts/analyse_logs.py`:
- Line 20: Remove the unused import of the zipfile module from
scripts/analyse_logs.py; since zip extraction is already handled by iter_lines()
in search_logs.py, delete the line importing zipfile to avoid an unused import
warning and keep imports minimal.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3e6bd6ac-a100-4e48-a181-dfe2df539b90

📥 Commits

Reviewing files that changed from the base of the PR and between 3b14997 and a63be26.

📒 Files selected for processing (6)

scripts/aggregate_logs.py
scripts/analyse_logs.py
scripts/filter_since.py
scripts/logs.sh
scripts/process_logs.py
scripts/search_logs.py

🚧 Files skipped from review as they are similar to previous changes (2)

scripts/filter_since.py
scripts/search_logs.py

github-actions · 2026-05-02T12:10:39Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/analyse_logs.py`:
- Around line 39-59: The function _iso_seconds currently returns raw[:19] inside
the except ValueError when parsing a record key, which prevents trying other
ISO_KEYS and the LEADING_TS_RE fallback; change the except handler inside the
rec loop so it does not return but continues to the next key (i.e., on
ValueError do nothing/continue), leaving the leading-timestamp block as the
ultimate fallback that can still return raw[:19] if its parse fails; update only
the try/except in the ISO_KEYS iteration (refer to ISO_KEYS and _iso_seconds) so
malformed record timestamps don't short-circuit other sources.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5a8575bd-7795-42a2-a4aa-f12b2ba47d99

📥 Commits

Reviewing files that changed from the base of the PR and between a63be26 and 74e69a8.

📒 Files selected for processing (1)

scripts/analyse_logs.py

github-actions · 2026-05-02T12:19:00Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/analyse_logs.py`:
- Around line 242-252: The analyzer currently appends every latency to
self.values_ms and self.slowest in _record, unboundedly growing memory; replace
the unbounded list with a bounded streaming structure (e.g., a fixed-size
reservoir sampler, TDigest, or hdrhistogram) and keep self.slowest as a bounded
top-N heap. Change _record to feed latencies into the chosen streaming sketch
(instead of self.values_ms.append) and to push into a min-heap of size N for
slowest entries (pop smallest when exceeding N). Update any percentile
calculation code that currently reads self.values_ms to query the new sketch API
(e.g., tdigest.quantile() or histogram.percentile()) so percentiles are computed
from the bounded structure. Ensure the top-N logic references the same symbols
(self.slowest, _record) so tests/consumers remain compatible.
- Around line 324-338: The current gap detection treats the first and last
elements of minutes as full windows and flags them when self.minute_info.get(m,
0) == 0, causing false positives; change the zero-info loop so when median >= 1
you only check interior minutes (skip the first and last observed minute, e.g.
iterate minutes[1:-1] when len(minutes) >= 3) and add those to gap_minutes,
while leaving the pairwise block (function pairwise + datetime/timedelta fill)
unchanged so missing minutes between observed timestamps are still filled.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b9737ab9-9662-40aa-9217-8f57990e1e35

📥 Commits

Reviewing files that changed from the base of the PR and between 74e69a8 and 0864ea9.

📒 Files selected for processing (1)

scripts/analyse_logs.py

coderabbitai · 2026-05-02T12:20:01Z

+        self.values_ms: list[float] = []
+        self.slowest: list[tuple[float, str, str]] = []  # (ms, ts, line)
+
+    def _record(self, ms: float, ts: str, line: str) -> None:
+        if ms < 0 or ms > 3_600_000:
+            return
+        self.values_ms.append(ms)
+        self.slowest.append((ms, ts, line))
+        if len(self.slowest) > 200:
+            self.slowest.sort(key=lambda x: -x[0])
+            self.slowest = self.slowest[:50]


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Latency percentile collection is still unbounded in memory.

self.values_ms retains one Python float per latency record for the entire app analysis. Because scripts/logs.sh reruns this analyzer on the growing run for periodic snapshots, a high-volume app can turn snapshot/final analysis into an avoidable memory failure. Consider switching percentile calculation to a bounded histogram or streaming sketch instead of materializing every sample.

Also applies to: 268-291

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/analyse_logs.py` around lines 242 - 252, The analyzer currently appends every latency to self.values_ms and self.slowest in _record, unboundedly growing memory; replace the unbounded list with a bounded streaming structure (e.g., a fixed-size reservoir sampler, TDigest, or hdrhistogram) and keep self.slowest as a bounded top-N heap. Change _record to feed latencies into the chosen streaming sketch (instead of self.values_ms.append) and to push into a min-heap of size N for slowest entries (pop smallest when exceeding N). Update any percentile calculation code that currently reads self.values_ms to query the new sketch API (e.g., tdigest.quantile() or histogram.percentile()) so percentiles are computed from the bounded structure. Ensure the top-N logic references the same symbols (self.slowest, _record) so tests/consumers remain compatible.

github-actions · 2026-05-02T21:25:27Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

scripts/analyse_logs.py (1)
241-248: ⚠️ Potential issue | 🟠 Major

Keep latency percentile state bounded.

self.values_ms still retains one float per latency sample for the full app analysis. Because this script is rerun against the growing run for periodic snapshots, high-volume runs can still fail from memory growth even though self.slowest is trimmed. Please switch percentile calculation to a bounded histogram/sketch instead of materializing every sample.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` around lines 241 - 248, The code keeps every latency
sample in self.values_ms which causes unbounded memory growth; change the
implementation so _record no longer appends raw floats but instead updates a
bounded histogram/sketch data structure (e.g., fixed-size bucket histogram,
t-digest, or a reservoir/two-phase sketch) and maintain total_count and bucket
counts; update __init__ to initialize the chosen structure (replace
self.values_ms: list[float] with something like self._histogram/buckets and
self._total_count) and modify _record(ms, ts, line) to validate ms, increment
the appropriate bucket and possibly update self.slowest, and add a percentile()
helper that computes percentiles by walking buckets (or querying the sketch);
ensure no code path retains per-sample floats so memory usage stays bounded
while percentiles remain computable.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/analyse_logs.py`:
- Around line 262-266: The current logic infers units from the magnitude of
rec.get("duration") (variable d) and misinterprets values like 500_000; change
it to ignore ambiguous bare "duration" values unless an explicit unit key is
present: check rec for explicit keys such as "duration_ms", "duration_ns",
"duration_s" (or other *_ms fields used earlier) and only convert/record when an
explicit unit is available (convert ns->ms or s->ms accordingly), otherwise skip
calling self._record for rec.get("duration"); update the block handling d to
perform these presence checks and conversions instead of magnitude-based
inference.
- Around line 274-276: The pct function currently computes an index via
round(...) which misbehaves for small n (banker's rounding) — update pct(vs:
list, p: float) to first handle empty lists, ensure vs is sorted, and compute
percentiles correctly: return statistics.median(vs) when p == 50, and use
statistics.quantiles(vs, n=100, method="inclusive")[p-1] or implement the
nearest‑rank method for p95/p99 to avoid round() biases; keep references to the
existing pct function and the vs and n variables and preserve original return
types.

---

Duplicate comments:
In `@scripts/analyse_logs.py`:
- Around line 241-248: The code keeps every latency sample in self.values_ms
which causes unbounded memory growth; change the implementation so _record no
longer appends raw floats but instead updates a bounded histogram/sketch data
structure (e.g., fixed-size bucket histogram, t-digest, or a reservoir/two-phase
sketch) and maintain total_count and bucket counts; update __init__ to
initialize the chosen structure (replace self.values_ms: list[float] with
something like self._histogram/buckets and self._total_count) and modify
_record(ms, ts, line) to validate ms, increment the appropriate bucket and
possibly update self.slowest, and add a percentile() helper that computes
percentiles by walking buckets (or querying the sketch); ensure no code path
retains per-sample floats so memory usage stays bounded while percentiles remain
computable.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9bbe81a6-32e1-400f-8e21-c8f795bca4e4

📥 Commits

Reviewing files that changed from the base of the PR and between 0864ea9 and ec62db1.

📒 Files selected for processing (1)

scripts/analyse_logs.py

coderabbitai

♻️ Duplicate comments (1)

scripts/analyse_logs.py (1)
241-304: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

LatencyProbe still grows memory unbounded with traffic volume.

At Line [242] and Line [248], every latency sample is retained for the full analysis window. On long/high-volume runs this can exhaust memory and break periodic/final analysis. Please switch percentile tracking to a bounded streaming structure (histogram/sketch/reservoir) and keep only bounded top-N slowest lines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` around lines 241 - 304, The LatencyProbe currently
stores every sample in self.values_ms causing unbounded memory growth; replace
the unbounded list with a bounded streaming quantile structure (e.g. a
fixed-size reservoir sampler or a quantile sketch like t-digest) and leave
self.slowest as the bounded top-N list. Concretely: in __init__ replace
self.values_ms: list[float] with a bounded reservoir/sketch instance; update
_record to insert into that reservoir/sketch instead of appending to
self.values_ms (still validate ms range and keep self.slowest/top-N trimming
logic unchanged); update report to compute p50/p95/p99/mean/max from the
reservoir/sketch’s APIs (or, for reservoir, compute percentiles from the
reservoir snapshot) and handle empty-case as before. Use the existing method
names (__init__, _record, feed, report) and fields (self.values_ms -> new
reservoir/sketch, self.slowest) so callers remain compatible.

🧹 Nitpick comments (1)

scripts/analyse_logs.py (1)
418-421: ⚡ Quick win

Handle invalid --regex input as a user error instead of a traceback.

User-supplied regex is compiled directly; invalid patterns currently raise re.error and crash noisily. Add validation with a clear stderr message and exit code 2 (arg error behavior).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` around lines 418 - 421, Validate user-provided
regexes in extra_regexes before adding them to patterns: for each r in
extra_regexes attempt re.compile(r) in a try/except block, and if re.error is
raised write a clear error message to stderr (including the offending pattern)
and exit with code 2 (sys.exit(2)); if compilation succeeds continue to append
(f"regex:{r}", r) to patterns as before so RegexBucketProbe("Ad-hoc keywords",
patterns) receives the same items but invalid regex inputs are handled as a user
error.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@scripts/analyse_logs.py`:
- Around line 241-304: The LatencyProbe currently stores every sample in
self.values_ms causing unbounded memory growth; replace the unbounded list with
a bounded streaming quantile structure (e.g. a fixed-size reservoir sampler or a
quantile sketch like t-digest) and leave self.slowest as the bounded top-N list.
Concretely: in __init__ replace self.values_ms: list[float] with a bounded
reservoir/sketch instance; update _record to insert into that reservoir/sketch
instead of appending to self.values_ms (still validate ms range and keep
self.slowest/top-N trimming logic unchanged); update report to compute
p50/p95/p99/mean/max from the reservoir/sketch’s APIs (or, for reservoir,
compute percentiles from the reservoir snapshot) and handle empty-case as
before. Use the existing method names (__init__, _record, feed, report) and
fields (self.values_ms -> new reservoir/sketch, self.slowest) so callers remain
compatible.

---

Nitpick comments:
In `@scripts/analyse_logs.py`:
- Around line 418-421: Validate user-provided regexes in extra_regexes before
adding them to patterns: for each r in extra_regexes attempt re.compile(r) in a
try/except block, and if re.error is raised write a clear error message to
stderr (including the offending pattern) and exit with code 2 (sys.exit(2)); if
compilation succeeds continue to append (f"regex:{r}", r) to patterns as before
so RegexBucketProbe("Ad-hoc keywords", patterns) receives the same items but
invalid regex inputs are handled as a user error.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2711b2d7-1b4e-40e7-96f1-a7d2b502835b

📥 Commits

Reviewing files that changed from the base of the PR and between ec62db1 and 5c8a36b.

📒 Files selected for processing (1)

scripts/analyse_logs.py

github-actions · 2026-05-02T21:47:19Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

github-actions · 2026-05-02T22:31:19Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

scripts/analyse_logs.py (1)
241-249: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Bound latency sample storage before snapshot analysis scales up.

self.values_ms still retains one float per latency event for the full app scan. Because logs.sh monitor re-runs analysis over the growing run every few minutes, a busy app can still push snapshot/final analysis into avoidable memory failure.

Also applies to: 270-304
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` around lines 241 - 249, The code currently appends
every latency into self.values_ms and self.slowest, causing unbounded memory
growth; modify __init__ and _record to keep bounded state: introduce constants
like SAMPLE_SIZE (e.g. 10000) and SLOWEST_KEEP (e.g. 100) and implement
reservoir sampling for self.values_ms to maintain a fixed-size random sample
instead of appending every ms, and maintain self.slowest as a fixed-size
min-heap (use heapq) that pushes (ms, ts, line) and pops the smallest when
length > SLOWEST_KEEP so only the top-N slowest are retained; apply the same
bounded-sampling logic to the other recording code paths referenced (the code
around the other recording block) so all latency collectors use the same
SAMPLE_SIZE and SLOWEST_KEEP behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/analyse_logs.py`:
- Around line 20-24: The import of itertools.pairwise will fail on Python <3.10;
replace the direct import with a guarded import and local fallback: attempt
"from itertools import pairwise" and if ImportError define a local
pairwise(iterable) generator (yield consecutive pairs using an iterator and
next) so existing uses (e.g., where pairwise is called in the script) continue
to work on older Pythons; ensure the fallback has the same signature and
behavior as itertools.pairwise so functions like main() and other callers remain
unchanged.

In `@scripts/logs.sh`:
- Around line 599-603: The aggregate step is unguarded: wrap each invocation of
aggregate_logs.py (including the occurrences near the per-iteration call and the
final pass at the end of the loop) with an if-check that captures its exit
status, and on failure call log_to_file with a clear message like "[$app] Failed
to aggregate logs (see output above)" and then return (for
per-iteration/capture_app context) or exit/handle appropriately for the final
pass; update the invocations in the same function that runs capture_app and the
final aggregation to follow the same pattern so failures don't silently kill
background jobs or abort without a targeted log line, referencing the
aggregate_logs.py calls and the surrounding capture_app/wait "$pid" logic to
place the checks.

---

Duplicate comments:
In `@scripts/analyse_logs.py`:
- Around line 241-249: The code currently appends every latency into
self.values_ms and self.slowest, causing unbounded memory growth; modify
__init__ and _record to keep bounded state: introduce constants like SAMPLE_SIZE
(e.g. 10000) and SLOWEST_KEEP (e.g. 100) and implement reservoir sampling for
self.values_ms to maintain a fixed-size random sample instead of appending every
ms, and maintain self.slowest as a fixed-size min-heap (use heapq) that pushes
(ms, ts, line) and pops the smallest when length > SLOWEST_KEEP so only the
top-N slowest are retained; apply the same bounded-sampling logic to the other
recording code paths referenced (the code around the other recording block) so
all latency collectors use the same SAMPLE_SIZE and SLOWEST_KEEP behavior.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 635467e0-dd89-425e-a98b-aa78efc514bd

📥 Commits

Reviewing files that changed from the base of the PR and between ec62db1 and adfde5d.

📒 Files selected for processing (2)

scripts/analyse_logs.py
scripts/logs.sh

github-actions · 2026-05-02T23:11:43Z

🐝 Review App Deployed

Homepage: https://hover-pr-373.fly.dev
Dashboard: https://hover-pr-373.fly.dev/dashboard

coderabbitai

♻️ Duplicate comments (1)

scripts/analyse_logs.py (1)
241-304: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Bound latency sample storage to prevent analysis memory blow-ups.

LatencyProbe still retains every latency in self.values_ms, which can grow without bound on long/high-volume runs and fail periodic/final analysis. Please switch percentile calculation to a bounded streaming structure (for example, histogram/t-digest/reservoir) and keep slowest entries in a fixed-size top-N heap.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/analyse_logs.py` around lines 241 - 304, The LatencyProbe currently
appends every sample to self.values_ms and self.slowest; change this to bounded
structures: replace unbounded self.values_ms with a fixed-size reservoir (e.g.
reservoir_size constant, maintain a reservoir list and a counter and use random
replacement in _record) so percentiles in report() are computed from the
reservoir instead of all samples, and replace self.slowest list logic with a
fixed-size min-heap (use heapq) kept in _record to maintain the top-N slowest
entries (push then pop when size exceeds max_slowest); update _record, __init__,
and report to use the new reservoir and heap (compute sorted percentiles from
the reservoir, and produce slowest by popping the heap into descending order).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@scripts/analyse_logs.py`:
- Around line 241-304: The LatencyProbe currently appends every sample to
self.values_ms and self.slowest; change this to bounded structures: replace
unbounded self.values_ms with a fixed-size reservoir (e.g. reservoir_size
constant, maintain a reservoir list and a counter and use random replacement in
_record) so percentiles in report() are computed from the reservoir instead of
all samples, and replace self.slowest list logic with a fixed-size min-heap (use
heapq) kept in _record to maintain the top-N slowest entries (push then pop when
size exceeds max_slowest); update _record, __init__, and report to use the new
reservoir and heap (compute sorted percentiles from the reservoir, and produce
slowest by popping the heap into descending order).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6f8f0005-c228-44c4-9f80-4c5de9cf434d

📥 Commits

Reviewing files that changed from the base of the PR and between ec62db1 and b81adff.

📒 Files selected for processing (2)

scripts/analyse_logs.py
scripts/logs.sh

Add logs.sh search and analyse tools

016a247

Add changelog entry for logs.sh

fb9176a